Distance Measures for Sequences

Author

  • Sandeep Hosangadi
Abstract

Given a set of sequences, the distances between pairs of them help us measure their similarity and derive structural relationships among them. For genomic sequences, such measures make it possible to construct the evolutionary tree of organisms. In this paper we compare several distance measures and examine a method that circularly shifts one sequence against the other to find a good alignment that minimizes the Hamming distance. We also use run-length encoding together with LZ77 to characterize the information in a binary sequence.
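The circular-shift alignment mentioned in the abstract can be illustrated with a minimal sketch. It assumes equal-length sequences and an exhaustive search over all shifts; the function names are hypothetical, and the paper's own procedure may differ. The run-length helper shows only the encoding step, since the abstract does not say how run-length encoding is combined with LZ77.

```python
from itertools import groupby


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_circular_hamming(a, b):
    """Try every circular left shift of b against a (brute force, O(n^2)).

    Returns (shift, distance) for the first shift that attains the minimum
    Hamming distance. This is an illustrative assumption, not necessarily
    the search strategy used in the paper.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    best_shift, best_dist = 0, hamming(a, b)
    for shift in range(1, len(b)):
        rotated = b[shift:] + b[:shift]  # circular left shift by `shift`
        dist = hamming(a, rotated)
        if dist < best_dist:
            best_shift, best_dist = shift, dist
    return best_shift, best_dist


def run_lengths(bits):
    """Run-length encode a sequence as a list of (symbol, run length) pairs."""
    return [(symbol, len(list(group))) for symbol, group in groupby(bits)]


if __name__ == "__main__":
    print(min_circular_hamming("ACGTAC", "GTACAC"))  # (4, 0)
    print(run_lengths("0001101"))  # [('0', 3), ('1', 2), ('0', 1), ('1', 1)]
```

In the first call the unshifted Hamming distance is 4, while shifting the second sequence left by four positions aligns it exactly with the first, giving a distance of 0.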

Similar resources

New distance and similarity measures for hesitant fuzzy soft sets

The hesitant fuzzy soft set (HFSS), as a combination of hesitant fuzzy and soft sets, is regarded as a useful tool for dealing with the uncertainty and ambiguity of real-world problems. In HFSSs, each element is defined in terms of several parameters with arbitrary membership degrees. In addition, distance and similarity measures are considered as the important tools in different areas such as ...

Several new results based on the study of distance measures of intuitionistic fuzzy sets

It is doubtless that intuitionistic fuzzy set (IFS) theory plays an increasingly important role in solving the problems under uncertain situation. As one of the most critical members in the theory, distance measure is widely used in many aspects. Nevertheless, it is a pity that part of the existing distance measures has some drawbacks in practical significance and accuracy. To make up for their...

Study on Phylogenetic Relationship among some of Iranian Wild Almond Species using Sequences of ITS1-5.8S rDNA-ITS2 Region and Chloroplastic trnL

Phylogenetic relations among 12 wild species of almonds, one cultivated almond and one species of peach were investigated using ITS1-5.8S rDNA-ITS2 sequences and the trnL region of chloroplast DNA. To do this, maximum-parsimony and neighbor-joining analyses were adopted. Results of the ITS data showed that the studied species of Prunus divided into only two groups but were unable to separate different section...

Formal Languages and Algorithms for Similarity Based Retrieval from Sequence Databases

The paper considers various formalisms based on Automata, Temporal Logic and Regular expressions for specifying queries over finite sequences. Unlike traditional semantics that associate true or false value denoting whether a sequence satisfies a query, the paper presents distance measures that associate a value in the interval [0, 1] with a sequence and a query, denoting how closely the seque...

An Empirical Comparison of Distance Measures for Multivariate Time Series Clustering

Multivariate time series (MTS) data are ubiquitous in science and daily life, and how to measure their similarity is a core part of MTS analyzing process. Many of the research efforts in this context have focused on proposing novel similarity measures for the underlying data. However, with the countless techniques to estimate similarity between MTS, this field suffers from a lack of comparative...

Journal title:
  • CoRR

Volume abs/1208.5713

Publication date: 2012